319 research outputs found

    Efficient Privacy-preserving Whole-Genome Variant Queries

    MOTIVATION: Diagnosis and treatment decisions based on genomic data have become widespread as the cost of genome sequencing gradually decreases. In this context, disease–gene association studies are of great importance. However, genomic data are very sensitive compared with other data types and contain information about individuals and their relatives. Many studies have shown that this information can be obtained from the query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-protected manner. The proposed solution privately outsources genomic data from arbitrarily many sources to two non-colluding proxies and allows genomic databases to be safely stored in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, it allows queries to run efficiently on hundreds of thousands of genomic records. RESULTS: We measure the performance of our solution with parameters similar to real-world applications. It is possible to query a genomic database with 3 000 000 variants with five genomic query predicates in under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants can be achieved in approximately 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike previous studies, it is possible to query multiple databases with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data. AVAILABILITY AND IMPLEMENTATION: https://gitlab.com/DIFUTURE/privacy-preserving-variant-queries. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
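
    The XOR-based sharing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: the function names, the bit-vector encoding of variants and the 8-bit width are assumptions chosen for illustration only. It shows how a value is split into two shares, one per non-colluding proxy, so that either share alone looks uniformly random, and how XOR can be evaluated locally on shares.

```python
import secrets


def xor_share(value: int, bits: int = 32) -> tuple[int, int]:
    """Split `value` into two XOR shares (hypothetical helper).

    Each share on its own is a uniformly random `bits`-bit number,
    so a single proxy learns nothing about `value`.
    """
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given bit width")
    mask = secrets.randbits(bits)   # share held by proxy A
    return mask, value ^ mask       # second share goes to proxy B


def xor_reconstruct(share_a: int, share_b: int) -> int:
    """Recombine the two shares into the original value."""
    return share_a ^ share_b


# Assumed encoding: bit i is 1 if the genome carries variant i.
genome = 0b10110010
query = 0b00010010

g_a, g_b = xor_share(genome, bits=8)
q_a, q_b = xor_share(query, bits=8)

# XOR is linear, so each proxy can combine its own shares locally;
# recombining the results yields genome XOR query.
combined = xor_reconstruct(g_a ^ q_a, g_b ^ q_b)
assert combined == genome ^ query
assert xor_reconstruct(g_a, g_b) == genome
```

    Only XOR comes for free in this scheme. Predicates that need AND gates, such as testing whether all query variants are present, require an interactive protocol between the proxies, which this sketch does not cover.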

    The erosion of nongambling spheres by smartphone gambling: a qualitative study on workplace and domestic disordered gambling

    The potential dangers of internet-based gambling as compared with more traditional land-based gambling have been increasingly investigated over the past decade. The general consensus appears to be that although internet gambling might not be a more dangerous medium for gambling per se, the 24/7 availability it generates for problem gamblers is. Because smartphones have become the most used way of gambling online, internet gambling must therefore be further subcategorized according to the device by which it is accessed. This study examines the issue by exploring the views of smartphone gamblers undergoing treatment for gambling disorder in focus group settings (N=35). Utilizing thematic analysis, the paper shows that smartphone gambling has colonized spaces previously regarded as nongambling spheres. The workplace, especially in male-dominated contexts, emerged as an accommodator and stimulator of gambling behavior, raising issues of productivity rather than criminality. Domestic gambling was mostly characterized by an invasion of bathroom and bedtime spheres of intimacy. The study examines the implications for prevention and treatment, focusing on the minimization of exposure to gambling stimuli, the erosion of intimacy that recovering gamblers must endure, and the necessity of embracing a broader definition of gambling-related harm.

    Evaluation of peak-picking algorithms for protein mass spectrometry

    Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves
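
    Of the three approaches named in the abstract, the signal-to-noise criterion is the simplest to sketch. The code below is an illustration under stated assumptions, not the method evaluated in the paper: the function name, the noise estimate (median absolute deviation around the median baseline), the threshold of 3 and the toy spectrum are all hypothetical choices.

```python
import statistics


def pick_peaks_snr(intensities, snr_threshold=3.0):
    """Return indices of local maxima whose signal-to-noise ratio
    meets `snr_threshold` (hypothetical helper).

    Assumptions: the baseline is the median intensity and the noise
    level is the median absolute deviation (MAD) from that baseline.
    """
    baseline = statistics.median(intensities)
    mad = statistics.median([abs(x - baseline) for x in intensities])
    noise = mad if mad > 0 else 1e-12   # guard against division by zero

    peaks = []
    for i in range(1, len(intensities) - 1):
        is_local_max = (intensities[i] > intensities[i - 1]
                        and intensities[i] >= intensities[i + 1])
        if is_local_max and (intensities[i] - baseline) / noise >= snr_threshold:
            peaks.append(i)
    return peaks


# Toy spectrum: flat noisy baseline with two clear peaks.
spectrum = [1.0, 1.2, 0.9, 1.1, 9.0, 1.0, 1.3, 0.8, 1.1, 6.5, 1.2, 1.0]
print(pick_peaks_snr(spectrum))  # [4, 9]
```

    Small local maxima in the baseline (for example index 1 or 6) are rejected because their ratio stays below the threshold. Comparing the returned indices against a manually defined reference set is what yields the sensitivity and specificity figures the abstract refers to.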

    FRED—a framework for T-cell epitope detection

    Summary: Over the last decade, immunoinformatics has made significant progress. Computational approaches, in particular the prediction of T-cell epitopes using machine learning methods, are at the core of modern vaccine design. Large-scale analyses and the integration or comparison of different methods are becoming increasingly important. We have developed FRED, an extendable, open source software framework for key tasks in immunoinformatics. In this, its first version, FRED offers easily accessible prediction methods for MHC binding and antigen processing as well as general infrastructure for the handling of antigen sequence data and epitopes. FRED is implemented in Python in a modular way and allows the integration of external methods.

    Charting a Dynamic DNA Methylation Landscape of the Human Genome

    DNA methylation is a defining feature of mammalian cellular identity and essential for normal development (1,2). Most cell types, except germ cells and pre-implantation embryos (3–5), display relatively stable DNA methylation patterns with 70–80% of all CpGs being methylated (6). Despite recent advances, our understanding of when, where and how many CpGs participate in genomic regulation remains too limited. Here we report the in-depth analysis of 42 whole-genome bisulfite sequencing (WGBS) data sets across 30 diverse human cell and tissue types. We observe dynamic regulation for only 21.8% of autosomal CpGs within a normal developmental context, a majority of which are distal to transcription start sites. These dynamic CpGs co-localize with gene regulatory elements, particularly enhancers and transcription factor binding sites (TFBS), which allows identification of key lineage-specific regulators. In addition, differentially methylated regions (DMRs) often harbor SNPs associated with cell type-related diseases as determined by GWAS. The results also highlight the general inefficiency of WGBS, as 70–80% of the sequencing reads across these data sets provided little or no relevant information regarding CpG methylation. To further demonstrate the utility of our DMR set, we use it to classify unknown samples and identify representative signature regions that recapitulate major DNA methylation dynamics. In summary, although in theory every CpG can change its methylation state, our results suggest that only a fraction does so as part of coordinated regulatory programs. Therefore, our selected DMRs can serve as a starting point to help guide novel, more effective reduced representation approaches to capture the most informative fraction of CpGs as well as further pinpoint putative regulatory elements.

    BALL - biochemical algorithms library 1.3

    Background: The Biochemical Algorithms Library (BALL) is a comprehensive rapid application development framework for structural bioinformatics. It provides an extensive C++ class library of data structures and algorithms for molecular modeling and structural bioinformatics. Using BALL as a programming toolbox not only greatly reduces application development times but also helps ensure stability and correctness by replacing error-prone reimplementations of complex algorithms with calls into a library that has been well tested by a large number of developers. In the ten years since its original publication, BALL has seen a substantial increase in functionality and numerous other improvements. Results: Here, we discuss BALL's current functionality and highlight the key additions and improvements: support for additional file formats, molecular edit-functionality, new molecular mechanics force fields, novel energy minimization techniques, docking algorithms, and support for cheminformatics. Conclusions: BALL is available for all major operating systems, including Linux, Windows, and Mac OS X. It is available free of charge under the GNU Lesser General Public License (LGPL). Parts of the code are distributed under the GNU General Public License (GPL). BALL is available as source code and binary packages from the project web site at http://www.ball-project.org. Recently, it has been accepted into the Debian project; integration into further distributions is currently being pursued.

    Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset

    Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.